Adding noop implementations for Sql persistence #5318
Draft
johnsimons wants to merge 76 commits into master from
Conversation
Refactors the upsert logic in several data stores to leverage EF Core's change tracking more efficiently. Instead of creating a new entity and then calling Update, the code now fetches the existing entity (if any) and modifies its properties directly. This reduces the overhead and potential issues associated with detached entities. The RecoverabilityIngestionUnitOfWork is also updated to use change tracking for FailedMessageEntity updates. This commit was made on the `john/more_interfaces` branch.
Adds data store and entities required for persisting licensing and throughput data. This includes adding new tables for licensing metadata, throughput endpoints, and daily throughput data, as well as configurations and a data store implementation to interact with these tables.
Also adds headers to the serialised entity.
Updates data stores to utilize IServiceScopeFactory instead of IServiceProvider for creating database scopes. This change improves dependency injection and resource management, ensuring proper scope lifecycle management, especially for asynchronous operations.
Adds full-text search capabilities for error messages, allowing users to search within message headers and, optionally, the message body. Introduces an interface for full-text search providers to abstract the database-specific implementation. Stores small message bodies inline for faster retrieval and populates a searchable text field from headers and the message body. Adds configuration option to set the maximum body size to store inline.
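The query side of this can be sketched as follows. This is an illustrative shape only, assuming SQL Server and a hypothetical `SearchableText` column holding the combined headers-plus-body text; actual table and column names may differ.

```sql
-- Hypothetical full-text query over the combined headers/body column.
SELECT Id, MessageId
FROM ProcessedMessages
WHERE CONTAINS(SearchableText, '"payment failed"');
```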
Removes the `internal` keyword from the `RecoverabilityJsonContext` class. This change allows the class to be accessible from other assemblies, potentially needed for serialization/deserialization scenarios outside the current assembly.
Stores message bodies to disk in parallel to improve ingestion performance. Instead of awaiting the completion of each write operation, it queues them, allowing multiple write tasks to run concurrently. It then awaits all tasks before saving the changes to the database.
Updates the configuration to no longer default the message body storage path to a location under `CommonApplicationData`. The path will now be empty by default. This change allows users to explicitly configure the storage location, preventing potential issues with default locations.
Refactors the Azure Blob Storage persistence to streamline its configuration. It removes the direct instantiation of BlobContainerClient within the base class and instead, registers the AzureBlobBodyStoragePersistence class for dependency injection, allowing the constructor to handle the BlobContainerClient creation. Additionally, it ensures that the ContentType metadata stored in Azure Blob Storage is properly encoded and decoded to handle special characters. Also, it adds MessageBodyStorageConnectionStringKey to the configuration keys for both PostgreSQL and SQL Server.
Implements data retention policy for audit messages and saga snapshots using a background service. This change introduces a base `RetentionCleaner` class that handles the logic for deleting expired audit data in batches. Database-specific implementations are provided for SQL Server and PostgreSQL, leveraging their respective locking mechanisms (sp_getapplock and advisory locks) to prevent concurrent executions of the cleanup process. Removes the registration of the `RetentionCleaner` from the base class and registers it on specific implementations. The cleanup process deletes processed messages and saga snapshots older than the configured retention period, optimizing database space and improving query performance.
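The database-specific locking described above can be sketched roughly as below. This is a sketch only: the lock name, timeout, and advisory-lock key are illustrative, not the values the implementation uses.

```sql
-- SQL Server: take an exclusive application lock so only one instance runs cleanup.
DECLARE @lockResult INT;
EXEC @lockResult = sp_getapplock
    @Resource = 'RetentionCleanup',   -- hypothetical lock name
    @LockMode = 'Exclusive',
    @LockOwner = 'Session',           -- session-level: held across transactions
    @LockTimeout = 0;                 -- fail fast instead of queueing

IF @lockResult >= 0
BEGIN
    -- ... run batched deletes here ...
    EXEC sp_releaseapplock @Resource = 'RetentionCleanup', @LockOwner = 'Session';
END

-- PostgreSQL equivalent: a session-level advisory lock keyed by an integer.
-- SELECT pg_try_advisory_lock(42);   -- returns true only for the lock holder
-- ... cleanup ...
-- SELECT pg_advisory_unlock(42);
```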
Wraps retention cleanup process in an execution strategy to handle transient database errors. Moves lock check to inside the execution strategy, and only logs success if the lock was acquired.
Resets the total deleted messages and snapshots counters, as well as the lockAcquired flag, on each retry attempt of the retention cleaner process. This prevents accumulation of values across retries when the execution strategy is used. Also, updates lock acquisition logic to use `AsAsyncEnumerable()` to prevent errors caused by non-composable SQL in `SqlQueryRaw` calls.
Adds metrics to monitor the retention cleanup process. This includes metrics for cleanup cycle duration, batch duration, deleted messages, skipped locks, and consecutive failures. These metrics provide insights into the performance and health of the retention cleanup process, allowing for better monitoring and troubleshooting.
Introduces ingestion throttling during retention cleanup to reduce contention. This change adds an `IngestionThrottleState` to manage the throttling. The retention cleaner now signals when cleanup starts and ends, and the audit ingestion process respects the current writer limit. A new `RetentionCleanupBatchDelay` setting is introduced to add a delay between processing batches of messages. Adds a capacity metric to monitor the current ingestion capacity.
Corrects an issue where endpoint reconciliation could lead to incorrect "LastSeen" values when endpoints are deleted and re-added. The previous implementation aggregated LastSeen values across all deleted records, potentially resulting in an outdated value being used. This change introduces a ranking mechanism to select the most recent LastSeen value for each KnownEndpointId during reconciliation. This ensures that the latest LastSeen value is used, improving the accuracy of endpoint activity tracking.
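A ranking query of this kind typically looks like the following. The table and column names here are assumed for illustration; the point is selecting the single most recent `LastSeen` per `KnownEndpointId` rather than aggregating across all deleted records.

```sql
-- Illustrative reconciliation shape: pick the newest LastSeen per endpoint.
WITH Ranked AS (
    SELECT KnownEndpointId,
           LastSeen,
           ROW_NUMBER() OVER (
               PARTITION BY KnownEndpointId
               ORDER BY LastSeen DESC) AS rn
    FROM DeletedKnownEndpoints   -- hypothetical source of deleted records
)
SELECT KnownEndpointId, LastSeen
FROM Ranked
WHERE rn = 1;                    -- keep only the most recent value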
Ensures distinct message IDs are deleted during retention cleanup. Adjusts the loop condition to continue deleting messages as long as the number of deleted items is greater than or equal to the batch size. This prevents premature termination of the cleanup process when a batch returns exactly the batch size, ensuring all eligible messages are removed.
Refactors the audit retention cleanup process to ensure reliability and prevent race conditions. It achieves this by: - Using session-level locks to maintain lock ownership across transactions, preventing premature lock release. - Encapsulating the entire cleanup process within a single lock, simplifying retry logic and ensuring all operations are executed by the same instance. - Wrapping each batch deletion in its own execution strategy and transaction to handle transient errors and maintain data consistency.
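A minimal T-SQL sketch of the batched-delete loop, including the `>=` loop condition mentioned above; the batch size, cutoff, and table name are illustrative.

```sql
-- Batched-delete sketch; values and names are assumptions.
DECLARE @BatchSize INT = 4000;
DECLARE @Cutoff DATETIME2 = DATEADD(DAY, -30, SYSUTCDATETIME());
DECLARE @Deleted INT = @BatchSize;

WHILE @Deleted >= @BatchSize     -- continue until a batch comes back short
BEGIN
    DELETE TOP (@BatchSize) FROM ProcessedMessages
    WHERE ProcessedAt < @Cutoff;
    SET @Deleted = @@ROWCOUNT;
END
```

Running each iteration in its own transaction keeps lock footprints small and lets a transient failure retry a single batch rather than the whole cleanup.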
Implements table partitioning based on the ProcessedAt timestamp. This change introduces table partitioning for both ProcessedMessages and SagaSnapshots tables to improve retention cleanup performance and manageability. Additionally, the change stores message bodies in date-based folders. Removes progressive ingestion throttling.
Prevents potential modification of the connection string by storing it in a read-only field. This enhances thread safety and data integrity, especially in concurrent scenarios like retention cleanup.
Adds a unique non-clustered index on the Id column of the ProcessedMessages table to be used as key index for the full-text index. Updates the creation of the full-text index on ProcessedMessages to use the newly created index. This change is required because the full-text index needs a unique key index.
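The index pair can be sketched as below; index and column names are assumed. SQL Server requires a single-column, unique, non-nullable key index for a full-text index, which is why the plain unique index on `Id` must exist first.

```sql
-- Sketch: unique key index, then the full-text index keyed on it.
CREATE UNIQUE NONCLUSTERED INDEX IX_ProcessedMessages_Id
    ON ProcessedMessages (Id);

CREATE FULLTEXT INDEX ON ProcessedMessages (SearchableText)  -- hypothetical column
    KEY INDEX IX_ProcessedMessages_Id;
```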
The retention period is rounded up to the nearest whole day to ensure that partitioning by day functions correctly and that no audit data is prematurely deleted. Since partitions are daily, only whole days should be used for retention period calculations.
Changes the partitioning scheme from daily to hourly. This provides more granular data management and improves query performance. The `ProcessedAt` timestamp is replaced with `CreatedOn`, truncated to the hour, as the partition key in both `ProcessedMessages` and `SagaSnapshots` tables. The body storage is also changed to store data in hourly folders.
Replaces partition truncation with a switch operation for greater compatibility with indexes. Aligns the `TimeSent` index with the partition scheme to improve query performance via partition elimination. Increases the database command timeout during migration to prevent failures on large databases.
Ensures that the staging table has a matching clustered index. This is a requirement for the `SWITCH PARTITION` operation to work correctly. The `SELECT INTO` statement copies columns but not indexes, so the index must be created explicitly.
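A sketch of the staging-table pattern described above; table, index, and partition numbers are illustrative. `SELECT INTO` copies columns but not indexes, so the matching clustered index must be recreated before `SWITCH` will succeed.

```sql
-- Create an empty staging table with the same columns (no indexes copied).
SELECT TOP (0) * INTO ProcessedMessages_Staging FROM ProcessedMessages;

-- Recreate the matching clustered index SWITCH requires on both sides.
CREATE CLUSTERED INDEX IX_Staging_CreatedOn
    ON ProcessedMessages_Staging (CreatedOn);

-- Move the expired partition out in a metadata-only operation...
ALTER TABLE ProcessedMessages SWITCH PARTITION 1 TO ProcessedMessages_Staging;
-- ...then drop the staging table to discard the rows.
DROP TABLE ProcessedMessages_Staging;
```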
Implements hourly partitioning for processed messages and saga snapshots in SQL Server to improve query performance and manage data retention. Introduces a partition function and scheme based on the 'CreatedOn' timestamp, and migrates existing tables onto this scheme. The partition manager handles boundary creation at runtime. Also, full-text search capabilities are added to the searchable content column for PostgreSQL and SQL Server. The staging table approach for partition truncation is replaced with a direct DELETE statement using partition elimination for SQL Server.
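The partitioning setup and the partition-elimination delete that replaced the staging approach look roughly like this; boundary values and object names are illustrative only.

```sql
-- Hourly partition function and scheme (sketch; real boundaries are created
-- at runtime by the partition manager).
CREATE PARTITION FUNCTION PF_Hourly (DATETIME2)
    AS RANGE RIGHT FOR VALUES ('2024-01-01T00:00', '2024-01-01T01:00');

CREATE PARTITION SCHEME PS_Hourly
    AS PARTITION PF_Hourly ALL TO ([PRIMARY]);

-- Deleting via $PARTITION lets SQL Server touch only the expired partition.
DELETE FROM ProcessedMessages
WHERE $PARTITION.PF_Hourly(CreatedOn) = 1;
```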
Increases the command timeout to 5 minutes for partition management operations. This change addresses potential timeout issues during partition cleanup, especially with the introduction of smaller, more frequent partitions. Long running partition delete operations might exceed the default timeout.
Pauses data ingestion during retention cleanup to prevent conflicts and data corruption: a flag blocks further ingestion while partitions are being dropped, avoiding race conditions.
Improves performance of hourly partition deletion by batching the deletion of records. This avoids potential timeouts and increases the efficiency of deleting large numbers of records.
Removes the 'CreatedOn' column from the composite indexes on the ProcessedMessages and SagaSnapshots tables. This change centralizes the partitioning logic within the database itself, leading to improved query performance and simplified index management.
Ensures correct handling of nullable DateTime values returned from the SQL query, preventing potential errors when determining the oldest hour with data. This change allows the system to gracefully handle cases where the database might not have any data, returning a null DateTime value which is then processed correctly.
Adds indexes to the CreatedOn column in the ProcessedMessages and SagaSnapshots tables for deployments where native table partitioning is not in use. These indexes improve query performance under retention policies, specifically the MIN() queries issued by the retention cleaner.
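A sketch of the index and the probe it supports; the index name is assumed. With `CreatedOn` as the leading key, the `MIN()` query becomes a cheap seek on the first index row instead of a scan.

```sql
-- Index supporting the retention cleaner's oldest-hour probe.
CREATE INDEX IX_ProcessedMessages_CreatedOn
    ON ProcessedMessages (CreatedOn);

-- The cleaner's probe; may return NULL when the table is empty,
-- which the caller handles as a nullable DateTime.
SELECT MIN(CreatedOn) FROM ProcessedMessages;
```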